A chronically weak area in research papers, reports, and reviews is the
complete identification of seminal background documents that formed the
building blocks for these papers. A method for systematically
determining these seminal references is presented. Citation-Assisted
Background (CAB) is based on the assumption that seminal documents tend
to be highly cited. Application of CAB to the field of Anthrax research
is presented. While CAB is a highly systematic approach for identifying
seminal references, it is not a substitute for the judgment of the
researchers, and serves as a supplement.

INTRODUCTION

Research is a method of systematically exploring the unknown to acquire
knowledge and understanding. Efficient research requires awareness of
all prior research and technology that could impact the research topic
of interest, and builds upon these past advances to create discovery
and new advances. The importance of this awareness of prior art is
recognized throughout the research community. It is expressed in
diverse ways, including requirements for Background sections in journal
research articles, invited literature surveys in targeted research
areas, and required descriptions of prior art in patent applications.

For the most part, development of Background material for any of the
above applications is relatively slow, labor intensive, and limited in
scope. Background material development usually involves some
combination of manually sifting through outputs of massive computer
searches, manually tracking references through multiple generations,
and searching one's own records for personal references. The few
studies that have been done on the adequacy of Background material in
documents show that only a modest fraction of relevant material is
included (MacRoberts and MacRoberts 1989; 1996; Liu 1993; Caine and
Caine 1992; Shadish et al. 1995; Moravcsik and Murugesan 1975).

In particular, an analysis of Medline papers on the hemodynamic
response to orotracheal intubation showed that recognized deficiencies
in research method were not acknowledged. The authors recommended that,
when submitting work for publication, investigators should provide
evidence of how they searched for previous work (Smith and Goodman
1997).

Another specific example was provided by MacRoberts and MacRoberts
(1997). Their earlier work on a genetics journal had indicated that
only 30% of the influences evident in the text are reflected in a
paper's references. To replicate that work, they studied the text of an
issue of Sida and extracted the influences of previous work evident
therein. Influences they judged present in the text appeared in the
references only 29% of the time.

Typically missing from standard Background section or review article
development, as well as from the specific examples cited above, is a
systematic approach for identifying the key documents and events that
provided the groundwork for the research topic of interest. This paper
presents such a systematic approach, called Citation-Assisted
Background (CAB). The next section describes the CAB concept and
outlines its operation, with an application to the area of Anthrax
research.

CONCEPT  DESCRIPTION 

The CAB concept (Kostoff and Shlesinger 2004) identifies the seminal
Background documents for a research area using citation analysis. CAB
rests on the assumption that a document that is a significant building
block for a specific research area will typically have been referenced
positively by a substantial number of people who are active researchers
in that specific area. Implementation of the CAB concept then requires
the following steps:

•  The research area of interest must be defined clearly

•  The documents that define the area of interest must be identified
   and retrieved

•  The references most frequently used in these documents must be
   identified and selected

•  These critical references must be analyzed, and integrated in a
   cohesive narrative manner to form a comprehensive Background section
   or separate literature survey

These required steps are achieved in the following manner.

1.  The research topic of interest is defined clearly by the
    researchers who are documenting their study results. For example,
    consider the topic of Anthrax. In a recent text mining study of
    Anthrax, the topical area was defined to include Anthrax research,
    clinical issues, and terrorist-related issues.

2.  The topical definition is sharpened further by the development of a
    literature retrieval query. In the text mining study mentioned
    above, the literature retrieval query was "anthrax OR anthracis OR
    anthraxin." Because of the relatively sharp focus of Anthrax, the
    query is quite small. Other text mining queries for broader
    literatures have required hundreds of query terms (Kostoff et al.
    1998, 2000, 2004).

3.  The query is entered into a database search engine, and documents
    relevant to the topic are retrieved. In the Anthrax text mining
    study mentioned above, 1834 documents were retrieved from the Web
    version of the Science Citation Index (SCI) for the years 1991 to
    late 2003. The SCI was used because it is the only major research
    database to contain references in a readily extractable format.

4.  These documents are combined to create a separate database, and all
    the references contained in these documents are extracted.
    Identical references are combined, the number of occurrences of
    each reference is tabulated, and a table of references and their
    occurrence frequencies is constructed. In the Anthrax text mining
    study, 25258 separate references were extracted and tabulated.
    Table 1 contains the ten highest frequency (most cited) references
    extracted from the Anthrax database.
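
The tabulation in step 4 can be sketched in a few lines of Python. This
is only an illustrative sketch, not the procedure actually run against
the SCI: the record layout (a dictionary with a "references" list), the
function name tabulate_references, and the three toy records are all
assumptions made for the example.

```python
from collections import Counter


def tabulate_references(records):
    """Count, for each distinct reference, how many retrieved records cite it."""
    counts = Counter()
    for record in records:
        # A set ensures a reference listed twice in one record is counted once.
        counts.update(set(record["references"]))
    # Highest-frequency references first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy records standing in for the retrieved database.
records = [
    {"id": "r1", "references": ["Leppla SH 1982", "Koch R 1876"]},
    {"id": "r2", "references": ["Leppla SH 1982", "Friedlander AM 1986"]},
    {"id": "r3", "references": ["Leppla SH 1982", "Friedlander AM 1986"]},
]

table = tabulate_references(records)
for reference, frequency in table:
    print(f"{reference}\t{frequency}")
```

Because the counts come only from the retrieved records, the resulting
frequency is the topic-specific one used by CAB, not a total citation
count from all sources.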

TABLE 1
Most highly cited documents

Author             Year  Source                Vol  Page   #CIT
Inglesby TV        1999  JAMA-J Am Med Assoc   281  1735   173
Leppla SH          1982  P Natl Acad Sci USA   79   3162   164
Friedlander AM     1986  J Biol Chem           261  7123   158
Duesbery NS        1998  Science               280  734    150
Meselson M         1994  Science               266  1202   143
Dixon TC           1999  New Engl J Med        341  815    137
Green BD           1985  Infect Immun          49   291    111
Petosa C           1997  Nature                385  833    105
Mikesell P         1983  Infect Immun          39   371    99
Milne JC           1994  J Biol Chem           269  20607  97

Two frequencies are computed for each reference, but only the first is
shown in Table 1. The frequency shown in the rightmost column is the
number of times each reference was cited by the 1834 records in the
retrieved database only. This number reflects the importance of a given
reference to the specific discipline of Anthrax. The second frequency
number (not shown) is the total number of citations the reference
received from all sources, and reflects the importance of a given
reference to all the fields of science that cited the reference. This
number is obtained from the citation field or citation window in the
SCI. In CAB, only the first frequency is used, since it is
topic-specific. Using the first discipline-specific frequency number
obviates the need to normalize citation frequencies for different
disciplines (due to different levels of activity in different
disciplines), as would be the case if total citation frequencies were
used to determine the ordering of the references.

Before presenting a specific implementation algorithm for the Anthrax
application, a few caveats will be discussed. First, listing and
selection of the most highly cited references are dependent on the
comprehensiveness and balance of the total records retrieved. Any
imbalances (from skewed databases or incorrect queries) can influence
the weightings of particular references, and result in some references
exceeding the selection threshold where not warranted, and others
falling below the threshold where not warranted.

Second, it is important that the query used for record retrieval be
extensive (Khan and Khor 2004; Harter and Hert 1997; Kantor 1994), as
was shown for the Anthrax application. The query needs to be checked
for precision and recall, which becomes complicated when assumptions of
binary relevance and binary retrieval are relaxed (Della Mea and
Mizzaro 2004). There are a multitude of issues to be considered when
evaluating queries and their impact on precision and recall. A recent
systems analytic approach to analyzing the information retrieval
process concludes that, for completeness, the interaction of the
Environment and the information retrieval system must be considered in
query development (Kagolovsky and Moehr 2004). The first author's
experiences (with the four studies done so far with CAB, including the
study reported in this paper) have shown that modest query changes may
substitute some papers at the citation selection threshold, but the
truly seminal papers have citations of such magnitude that they are
invulnerable to modest query changes. For this reason, the cutoff
threshold for citations has been, and should be, set slightly lower, to
compensate for query uncertainties.

Third, there may be situations where at least minimal citation
representation is desired from each of the major technical thrust areas
in the documents retrieved. In this case, the retrieved documents could
be clustered into the major technical thrust areas, and the CAB process
could be performed additionally on the documents for each cluster. The
additional references identified with the cluster-level CAB process,
albeit with lower citations than from the aggregated non-clustered CAB
process, would then be added to the list obtained with the aggregated
CAB process. The first author has not found this cluster-level CAB
process necessary for any of the disciplines studied with CAB so far.
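
The cluster-level variant described in the third caveat might look
roughly like the following Python sketch. It assumes each retrieved
record already carries a cluster label (how the clustering is done is
outside the sketch), and the function names, the "cluster" field, the
cutoffs, and the toy records are all hypothetical.

```python
from collections import Counter


def top_references(records, n):
    """Return the n references cited by the most records, most cited first."""
    counts = Counter()
    for record in records:
        counts.update(set(record["references"]))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [reference for reference, _ in ranked[:n]]


def cab_with_clusters(records, n_aggregate, n_per_cluster):
    """Aggregate CAB list, extended by the top references of each cluster."""
    selected = top_references(records, n_aggregate)
    clusters = {}
    for record in records:
        clusters.setdefault(record["cluster"], []).append(record)
    for name in sorted(clusters):
        for reference in top_references(clusters[name], n_per_cluster):
            if reference not in selected:
                # Lower-cited overall, but representative of this thrust area.
                selected.append(reference)
    return selected


# Hypothetical toy records with assumed cluster labels.
records = [
    {"cluster": "toxin", "references": ["ref A", "ref C"]},
    {"cluster": "toxin", "references": ["ref A"]},
    {"cluster": "toxin", "references": ["ref A"]},
    {"cluster": "vaccine", "references": ["ref B"]},
]

selected = cab_with_clusters(records, n_aggregate=1, n_per_cluster=1)
print(selected)
```

In this toy case the small "vaccine" cluster contributes "ref B", which
would not have passed the aggregate cutoff on its own.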

Fourth, there may be errors in citation counts due to reference errors,
and the subsequent fragmenting of a reference's occurrence frequency
metric into smaller metric values. Care needs to be taken in ensuring
that a given reference is not fissioned into multiple large fragments
that are not subsequently combined.
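
One way to recombine such fragments is to group references under a
normalized key before summing their frequencies. The Python sketch
below is an assumption about how this could be done, not the procedure
used in the study: the key (author letters, year, volume, page), the
function names, and the split of counts across the variant spellings
are invented for illustration.

```python
import re
from collections import Counter


def reference_key(author, year, volume, page):
    """Build a key that ignores case, spacing, and punctuation in the author."""
    author_key = re.sub(r"[^a-z]", "", author.lower())
    return (author_key, int(year), str(volume).strip(), str(page).strip())


def merge_fragments(counted_references):
    """Sum the frequencies of references that share a normalized key."""
    merged = Counter()
    for (author, year, volume, page), frequency in counted_references:
        merged[reference_key(author, year, volume, page)] += frequency
    return merged


# Hypothetical fragments: the split 90/6/3 is invented for the example.
fragments = [
    (("Mikesell P", 1983, "39", "371"), 90),
    (("Mike sell P", 1983, "39", "371"), 6),
    (("MIKESELL P.", 1983, "39", "371"), 3),
    (("Green BD", 1985, "49", "291"), 111),
]

merged = merge_fragments(fragments)
for key, frequency in merged.most_common():
    print(key, frequency)
```

A key this strict will not catch errors in the year, volume, or page
fields, so a manual check of near-duplicates close to the selection
threshold is still advisable.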

Fifth, the CAB approach is most accurate for recent references, and its
accuracy drops as the references recede into the distant past. This
results from the tendency of authors to reference more recent documents
and, given the restricted real estate in journals, not reference the
original documents. To get better representation, and more accurate
citation numbers, for early historical documents, the more recent
references need to be retrieved, collected into a database, and have
their references analyzed in a similar manner (essentially examining
generations of citations).

Sixth, high citation frequencies are not unique to seminal documents;
different types of references can have high citation frequencies.
Documents that contain critical research advances, and were readily
accessible in the open literature, tend to be cited highly, and
represent the foundation of the CAB approach. Application of CAB to
three technical research areas so far (in addition to the present
Anthrax study) shows that this type of document is predominant in the
highly cited references list. Books or review articles also appear on
the highly cited references list. These documents do not usually
represent new advances, but rather are summaries of the state of the
art (and its Background) at the time the document was written. These
types of documents are still quite useful as Background material.
Finally, documents that receive large numbers of citations highly
critical of the document could be included in the list of highly cited
documents. In three studies so far, the first author has not identified
such papers in the detailed development of the Background.

Additionally, one of the three application studies concerns high-speed
compressible flow, a discipline in which the first author worked
decades ago. Using the CAB approach, the first author found that all
the key historical documents with which he was familiar were
identified, and all the historical documents identified appeared to be
important. Thus, for that data point at least, the weaknesses
identified above (imbalances, undervaluing early historical references,
unwanted highly cited documents) did not materialize. To ensure that
critical documents were not missed because of imbalance problems, the
threshold was set a little lower to be more inclusive.

The converse problem to multiple types of highly cited references, some
of which may not be the seminal documents desired, is influential
references that do not have substantial citation frequencies. If the
authors of these references did not publish them in widely and readily
accessible forums, or if they do not contain appropriate verbiage for
optimal query accessibility, then they might not have received large
numbers of citations. Additionally, journal or book space tends to be
limited, with limited space for references. In this zero-sum game for
space, research authors tend to cite relatively recent records at the
expense of the earlier historical records. Also, extremely recent but
influential references have not had the time to accumulate sufficient
citations to be listed above the selection threshold on the citation
frequency table. Methods of including these influential records located
at the wings of the temporal distribution will be described in the
following implementation section. Inclusion of the references that were
not widely available when published is more problematic, and tends to
rely on the Background developers' personal knowledge of these
documents, and their influence.

CONCEPT  IMPLEMENTATION 

To identify the total candidate references for the Background section,
a table similar in structure to Table 1, but containing all the
references from the retrieved records, is constructed. A threshold
frequency for selection can be determined by arbitrary inspection
(e.g., a Background section consisting of 150 key references is
arbitrarily selected). The first author has found a dynamic selection
process more useful. In this dynamic process, references are selected,
analyzed, and grouped based on their order in the citation frequency
table until the resulting Background is judged sufficiently complete by
the Background developers.

To ensure that the influential documents at the wings of the temporal
distribution are included, the following total process is used. The
reference frequency table is ordered by descending frequency, as above,
and a high value of the selection frequency threshold is chosen
initially. Then, the table is re-ordered chronologically. The early
historical documents with citation frequencies substantially larger
than those of their contemporaries are selected, as are the extremely
recent documents with citation frequencies substantially larger than
those of their contemporaries. By contemporaries, it is meant documents
published in the same time frame, not limited to the same year. Then,
the dynamic selection process defined above is applied to the early
historical references, the intermediate time references (those falling
under the high frequency threshold), and the extremely recent
references.
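
The comparison against contemporaries is left to judgment in the text.
As one possible way to make it concrete, the Python sketch below flags
a reference at either temporal wing when its citation count is at least
a chosen multiple of the median count of references published within a
chosen number of years of it. The window of 10 years, the factor of 3,
the year boundaries for the wings, and the "hypothetical early paper"
entries are all assumptions introduced for illustration.

```python
from statistics import median


def stands_out(candidate, references, window_years=10, factor=3.0):
    """True if the candidate is cited far more than its contemporaries."""
    year, citations = candidate["year"], candidate["citations"]
    peers = [
        r["citations"]
        for r in references
        if r is not candidate and abs(r["year"] - year) <= window_years
    ]
    if not peers:
        return False  # no contemporaries to compare against; leave to judgment
    return citations >= factor * median(peers)


def select_wings(references, early_before, recent_from, **kwargs):
    """Pick stand-out references from the early and recent temporal wings."""
    chronological = sorted(references, key=lambda r: r["year"])
    return [
        r
        for r in chronological
        if (r["year"] < early_before or r["year"] >= recent_from)
        and stands_out(r, references, **kwargs)
    ]


# Toy data: only the Koch entry comes from the study; the rest are invented.
references = [
    {"label": "Koch R 1876", "year": 1876, "citations": 10},
    {"label": "hypothetical early paper 1", "year": 1868, "citations": 2},
    {"label": "hypothetical early paper 2", "year": 1872, "citations": 2},
    {"label": "hypothetical early paper 3", "year": 1875, "citations": 2},
]

wings = select_wings(references, early_before=1900, recent_from=2002)
print([r["label"] for r in wings])
```

Such a rule only proposes candidates; the dynamic selection process and
the developers' judgment still decide what enters the Background.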

Table 2 contains the final references selected for the Anthrax
Background survey. The first reference listed, Koch's 1876 paper, had
many more citations (ten) than any papers published in the 1860s or
1870s. In fact, there were half a dozen papers published between 1528
and 1876 that had two citations each, and these were the closest to
Koch's paper. This is a graphic example of how we interpret a paper's
having substantially more citations than its contemporaries.

TABLE 2
Seminal documents selected for inclusion in background

Author             Year  Source                Vol  Page   #CIT
Koch R             1876  Beitr Biol Pflanz     2    277    10
Pasteur L          1881  C R Acad Sci Agr Bul  92   429    15
Sterne M           1939  J Vet Sci Anim Ind    13   307    24
Gladstone GP       1946  Br J Exp Pathol       27   394    14
Smith H            1954  Nature                173  869    31
Belton FC          1954  Brit J Expt Patholog  35   144    26
Smith H            1955  Brit J Exp Pathol     36   460    22
Henderson DW       1956  J Hyg                 54   28     34
Ross JM            1957  J Pathol Bacteriol    73   485    38
Albrink WS         1960  Am J Pathol           36   457    21
Stanley JL         1961  J Gen Microbiol       26   49     55
Brachman PS        1962  Am J Public Health    52   632    69
Beall FA           1962  J Bacteriol           83   1274   58
Smith H            1962  J Gen Microbiol       29   517    21
Puziss M           1963  Appl Microbiol        11   330    30
Laforce FM         1969  Arch Environ Health   18   798    20
Laemmli UK         1970  Nature                227  680    42
Vanness GB         1971  Science               172  1303   21
Miller JH          1972  Expt Mol Genetics                 33
Sanger F           1977  P Natl Acad Sci USA   74   5463   22
Kaneko T           1978  Microbiol Immunol     22   639    37
Gill DM            1978  Bacterial Toxins Cel       291    26
Seki T             1978  Int J Syst Bacteriol  28   182    23
Brachman PS        1980  Ann NY Acad Sci       353  83     51
Leppla SH          1982  P Natl Acad Sci USA   79   3162   164
Mikesell P         1983  Infect Immun          39   371    99
Vodkin MH          1983  Cell                  34   693    45
Ristroph JD        1983  Infect Immun          39   483    30
Leppla SH          1984  Adv Cyclic Nucl Prot  17   189    54
Hambleton P        1984  Vaccine               2    125    44
Ezzell JW          1984  Infect Immun          45   761    34
Green BD           1985  Infect Immun          49   291    111
Uchida I           1985  J Gen Microbiol       131  363    55
Obrien J           1985  Infect Immun          47   306    32
Friedlander AM     1986  J Biol Chem           261  7123   158
Turnbull PCB       1986  Infect Immun          52   356    54
Ivins BE           1986  Infect Immun          54   537    52
Little SF          1986  Infect Immun          52   509    46
Welkos SL          1986  Infect Immun          51   795    43
Ivins BE           1986  Infect Immun          52   454    42
Robertson DL       1986  Gene                  44   71     34
Leppla SH          1988  Method Enzymol        165  103    79
Welkos SL          1988  Gene                  69   287    74
Gordon VM          1988  Infect Immun          56   1066   70
Ivins BE           1988  Eur J Epidemiol       4    12     39
Escuyer V          1988  Gene                  71   293    38
Robertson DL       1988  Gene                  73   363    35
Mock M             1988  Gene                  64   277    34
Makino S           1988  Mol Microbiol         2    371    33
Leppla SH          1988  Bacterial Protein TO       111    33
Turnbull PCB       1988  Med Microbiol Immun   177  293    32
Singh Y            1989  J Biol Chem           264  19103  67
Sambrook J         1989  Mol Cloning Lab Manu              62
Makino S           1989  J Bacteriol           171  722    61
Blaustein RO       1989  P Natl Acad Sci USA   86   2209   59
Bragg TS           1989  Gene                  81   45     53
Singh Y            1989  J Biol Chem           264  11099  44
Bhatnagar R        1989  Infect Immun          57   2107   39
Bartkus JM         1989  Infect Immun          57   2295   32
Cataldi A          1990  Mol Microbiol         4    1111   34
Labruyere E        1990  Biochemistry-US       29   4922   33
Ivins BE           1990  Infect Immun          58   303    30
Pezard C           1991  Infect Immun          59   3472   95
Leppla SH          1991  Sourcebook Bacterial       277    89
Escuyer V          1991  Infect Immun          59   3381   72
Ash C              1991  Int J Syst Bacteriol  41   343    72
Turnbull PCB       1991  Vaccine               9    533    68
Quinn CP           1991  J Biol Chem           266  20124  40
Koehler TM         1991  Mol Microbiol         5    1501   38
Singh Y            1991  J Biol Chem           266  15493  34
Klimpel KR         1992  P Natl Acad Sci USA   89   10277  96
Turnbull PCB       1992  J Appl Bacteriol      72   21     46
Ivins BE           1992  Infect Immun          60   662    38
Hanna PC           1992  Mol Biol Cell         3    1269   37
Molloy SS          1992  J Biol Chem           267  16396  37
Novak JM           1992  J Biol Chem           267  17186  35
Arora N            1992  J Biol Chem           267  15542  35
Hanna PC           1993  P Natl Acad Sci USA   90   10198  94
Friedlander AM     1993  J Infect Dis          167  1239   78
Abramova FA        1993  P Natl Acad Sci USA   90   2291   69
Arora N            1993  J Biol Chem           268  3334   53
Milne JC           1993  Mol Microbiol         10   647    51
Uchida I           1993  J Bacteriol           175  5329   48
Friedlander AM     1993  Infect Immun          61   245    39
Pezard C           1993  J Gen Microbiol       139  2459   35
Thorne CB          1993  Bacillus Subtilis OT       113    31
Drobniewski FA     1993  Clin Microbiol Rev    6    324    30
Meselson M         1994  Science               266  1202   143
Milne JC           1994  J Biol Chem           269  20607  97
Klimpel KR         1994  Mol Microbiol         13   1093   85
Hanna PC           1994  Mol Med               1    7      52
Koehler TM         1994  J Bacteriol           176  586    46
Singh Y            1994  J Biol Chem           269  29039  36
Arora N            1994  Infect Immun          62   4955   32
Laforce FM         1994  Clin Infect Dis       19   1009   31
Henderson I        1994  Int J Syst Bacteriol  44   99     30
Sirard JC          1994  J Bacteriol           176  5188   30
Milne JC           1995  Mol Microbiol         15   661    42
Harrell LJ         1995  J Clin Microbiol      33   1847   39
Dai ZH             1995  Mol Microbiol         16   1171   38
Leppla SH          1995  Handb Nat T           8    543    34
Etiennetoumelin I  1995  J Bacteriol           177  614    33
Leppla SH          1995  Bacterial Toxins Vir       543    30
Ramisse V          1996  FEMS Microbiol Lett   145  9      34
Andersen GL        1996  J Bacteriol           178  377    34
Petosa C           1997  Nature                385  833    105
Franz DR           1997  JAMA-J Am Med Assoc   278  399    69
Keim P             1997  J Bacteriol           179  818    53
Christopher GW     1997  JAMA-J Am Med Assoc   278  412    45
Torok TJ           1997  JAMA-J Am Med Assoc   278  389    40
Zilinskas RA       1997  JAMA-J Am Med Assoc   278  418    36
Duesbery NS        1998  Science               280  734    150
Vitale G           1998  Biochem Bioph Res Co  248  706    75
Pile JC            1998  Arch Intern Med       158  429    47
Hanna P            1998  Curr Top Microbiol    225  13     41
Wesche J           1998  Biochemistry-US       37   15737  39
Benson EL          1998  Biochemistry-US       37   3941   36
Patra G            1998  J Clin Microbiol      36   3412   34
Jackson PJ         1998  P Natl Acad Sci USA   95   1224   32
Inglesby TV        1999  JAMA-J Am Med Assoc   281  1735   173
Dixon TC           1999  New Engl J Med        341  815    137
Henderson DA       1999  JAMA-J Am Med Assoc   281  2127   54
Pellizzari R       1999  FEBS Lett             462  199    52
Okinaka RT         1999  J Bacteriol           181  6509   43
Friedlander AM     1999  JAMA-J Am Med Assoc   282  2104   42
Guidirontani C     1999  Mol Microbiol         31   9      41
Miller CJ          1999  Biochemistry-US       38   10432  31
Shafazand S        1999  Chest                 116  1369   31
Henderson DA       1999  Science               283  1279   30
Helgason E         2000  Appl Environ Microb   66   2627   69
Keim P             2000  J Bacteriol           182  2928   55
Inglesby TV        2000  JAMA-J Am Med Assoc   283  2281   55
Vitale G           2000  Biochem J 3           352  739    38
*CDCP              2000  MMWR-Morbid Mortal W  49   1      23
Elliott JL         2000  Biochemistry-US       39   6706   20
Khan AS            2000  Lancet                356  1179   20
Turnbull PCB       2000  Curr Opin Infect Dis  13   113    20
Jernigan JA        2001  Emerg Infect Dis      7    933    96
Bradley KA         2001  Nature                414  225    66
Mock M             2001  Annu Rev Microbiol    55   647    49
CDCP               2001  MMWR-Morbid Mortal W  50   909    48
Pannifer AD        2001  Nature                414  229    46
Dennis DT          2001  JAMA-J Am Med Assoc   285  2763   42
Swartz MN          2001  New Engl J Med        345  1621   39
Sellman BR         2001  Science               292  695    34
Mourez M           2001  Nat Biotechnol        19   958    32
Borio L            2001  JAMA-J Am Med Assoc   286  2554   31
Arnon SS           2001  JAMA-J Am Med Assoc   285  1059   30
*CDCP              2001  MMWR-Morbid Mortal W  50   889    30
Bush LM            2001  New Engl J Med        345  1607   27
*CDCP              2001  MMWR-Morbid Mortal W  50   941    27
Mayer TA           2001  JAMA-J Am Med Assoc   286  2549   24
Grinberg LM        2001  Modern Pathol         14   482    23
Inglesby TV        2002  JAMA-J Am Med Assoc   287  2236   59
Barakat LA         2002  JAMA-J Am Med Assoc   287  863    23
Drum CL            2002  Nature                415  396    21
Read TD            2003  Nature                423  81     8
Scobie HM          2003  P Natl Acad Sci USA   100  5170   7

These results were examined by the authors. They judged that all papers
in the table were relevant for a Background section or review paper.
Due to space considerations, not all papers listed will be included in
the historical narrative shown in the next section.

The analysis and discussion above have focused on the contents of the
Background; i.e., which documents should be included. In some cases,
the Abstracts of the seminal references have been retrieved and
clustered, to produce a structure for the Background. Thus, the CAB
approach can be used to determine both the content and structure of the
Background section. Again, CAB does not exclude content and structure
determinations by the experts. CAB can be viewed as the starting point
for content and structure determination, upon which the experts can
build with their own insights and experience.

While the CAB approach is systematic, it is not automatic. Judgment is
required to determine when an adequate number of references has been
selected for the Background, and further judgment is required to
analyze, group, and link the references to form a cohesive Background
section. Additionally, the highly influential references that were not
highly cited due to insufficient dissemination should be included by
the Background developers, if they know of such documents. CAB is not
meant to replace individual judgment or specification of Background
material. CAB is meant to augment individual judgment and reference
selection, as reflected in its name of Citation-Assisted.

ANTHRAX  LITERATURE  REVIEW 
Overview 

Anthrax is primarily a zoonotic disease caused by the spore-forming
bacterium Bacillus anthracis. The ability to form spores permits the
organism to survive environmental conditions that kill most other
bacteria. Dormant spores present in the soil infect mainly herbivores
(and carnivores that eat the herbivores). Spores can infect humans who
come in contact with the infected animal or its products (e.g., meat,
hides, wool, etc.) (Boutiba-Ben Boubaker and Ben Redejeb 2001;
Jedrzejas 2002; Mock and Fouet 2001).

Anthrax has had a long history. It was thought to be responsible for
the 5th and 6th plagues in Egypt that were described in the Old
Testament. Subsequently, there were numerous descriptions of a disease
resembling anthrax in both animals and humans in the literature of the
Greeks, Romans, and Hindus (Dirckx 1981). In the Middle Ages, anthrax
swept across Europe, killing large numbers of humans and animals
(Turnbull 2002). With the industrialization of Europe, smaller
outbreaks of anthrax began to occur in factories where imported animal
hides and hair were processed (Hugh-Jones 1999). The association of
anthrax with wool led to the name woolsorters' disease.


The study of anthrax led to the development of modern bacteriology,
serology, and immunology. The microorganisms were first seen in 1863 by
Davaine, who proved their infectivity. For an eloquent description of
Davaine's discoveries, see the reply of Pasteur to a paper by Koch in
an Extract from The Scientific Review Paris of 20 January 1883. In
1876, Robert Koch isolated the bacillus in pure culture in the vitreous
of cow's eyes and established Koch's postulates (Koch 1876). Shortly
thereafter, Louis Pasteur demonstrated protection against anthrax
following immunization of sheep with a live attenuated bacterial
vaccine (Pasteur 1881). It was not until 1954 that a toxin was shown to
be responsible for the death of infected animals (Smith and Keppie
1954).

Anthrax is still enzootic in most developing countries and it occurs
sporadically in many other countries (Hugh-Jones 1999). West Africa is
the most affected area of the world (Davies 1982; Hugh-Jones 1999).
Anthrax remains a significant problem in other parts of Africa, Central
America, Spain, Greece, Turkey, Albania, Romania, Central Asia, and the
Middle East (Bales et al. 2002; Cieslak and Eitzen 1999; Hugh-Jones
1999; Kaya et al. 2002; Schmidt and Kaufman 2002).

Between  20,000  and  100,000  cases  of  human  anthrax  are  es¬ 
timated  to  occur  worldwide  annually  (Cieslak  and  Eitzen  1999). 
Because  anthrax  remains  a  problem  in  developing  countries, 
animal  products  imported  from  these  areas  continue  to  pose  a 
risk.  Human  cases  occur  infrequently  in  economically  advanced 
countries,  where  animal  anthrax  is  under  control.  The  incidence 
of  infection  has  been  reduced  dramatically  by  vaccination  of 
high-risk  people  and  animals,  along  with  improvements  in  in¬ 
dustrial  hygiene  (Jefferson  et  al.  2000;  Turner  et  al.  1999).  For 
example,  in  the  United  States,  there  were  about  120  cases  per 
year  in  the  early  part  of  the  20th  century,  which  declined  to  less 
than  1  case  per  year  during  the  1990s. 

B.  anthracis  is  an  aerobic  or  facultatively  anaerobic,  large, 
square-ended  Gram-positive  rod  with  a  centrally  located  ellip¬ 
soidal  to  cylindrical  spore.  Recent  taxonomic  studies  indicate 
that B. anthracis is closely related to Bacillus cereus and Bacillus
thuringiensis and that these three microorganisms should be considered
a single species (Helgason et al. 2000). Furthermore, it
is likely that B. anthracis is a lineage of B. cereus, which has implications
for virulence and for horizontal gene transfer within
this group of organisms (Helgason et al. 2000). Chains of virulent
cells  of  B.  anthracis  are  usually  surrounded  by  a  capsule;  avir- 
ulent  strains  are  often  unencapsulated.  Sporulation  occurs  in 
the  soil  and  on  culture  medium  but  not  in  living  tissue,  unless 
exposed  to  air.  Spores  enter  the  human  host  through  breaks  in 
the  skin,  inhalation,  or  by  ingestion,  where  they  are  engulfed 
by  macrophages  or  other  phagocytic  cells.  The  spores  germi¬ 
nate  within  the  phagocytic  cell  forming  encapsulated  vegetative 
cells  that  produce  several  extracellular  protein  toxins  (Brassier 
et  al.  1998;  Brassier  et  al.  2000;  Mourez  et  al.  2002). 

There  are  different  clinical  forms  of  anthrax,  which  reflect  the 
route  by  which  the  spores  entered  the  host.  The  vast  majority  of 
cases  of  naturally  acquired  anthrax  (ca.  95%)  are  the  cutaneous 
form, followed by the inhalational, gastrointestinal and other
rare  forms.  Cutaneous  anthrax  begins  as  a  small,  painless,  but 
often  pruritic  papule.  As  the  papule  enlarges,  it  becomes  vesicu¬ 
lar  and,  within  2  days,  ulcerates  to  form  a  distinctive  black  (hence 
the  name  of  the  disease)  eschar,  with  surrounding  edema.  The 
case  fatality  rate  of  untreated  cutaneous  anthrax  can  be  as  high  as 
25%.  Inhalational  anthrax  begins  with  an  upper-respiratory  flu- 
like  syndrome,  which  after  a  few  days  takes  a  fulminant  course, 
manifested by dyspnea, cough, chills, and a high-grade bacteremia.
Massive hilar adenopathy and mediastinal hemorrhage
are evident in chest x-rays as a widening of the hilum, followed by
massive  widening  of  the  mediastinum  with  clear  and  sharp  bor¬ 
ders  (Vessal  1975).  If  not  recognized  and  treated  early,  nearly  all 
patients  with  this  disease  will  die  within  several  days.  Gastroin¬ 
testinal  anthrax  probably  occurs  more  frequently  than  realized. 
Most  cases  are  recognized  after  death  because  clinical  diagno¬ 
sis  is  extremely  difficult.  Many  mild  cases  probably  escape  de¬ 
tection.  In  gastrointestinal  anthrax  there  is  mucosal  ulceration, 
mesenteric  adenitis,  ascites,  cholera-like  diarrhea,  and  moderate 
to  severe  fever  with  chills  relatively  late  in  the  illness  as  a  sign 
of  septicemia,  leukocytosis  and  hemoconcentration.  X-ray  films 
may  show  signs  of  intestinal  obstruction.  The  case  fatality  rate  of 
untreated  gastrointestinal  anthrax  is  >50%.  Prompt  clinical  sus¬ 
picion  and  rapid  administration  of  effective  antimicrobials  are 
essential  for  the  treatment  of  all  forms  of  anthrax.  If  untreated, 
all  forms  of  anthrax  can  lead  to  septicemia  and  death. 

Research  History 

The  major  known  virulence  factors  of  B.  anthracis  are  the 
antiphagocytic poly-γ-D-glutamic acid capsule and the toxin
(Beall  et  al.  1962).  Anthrax  toxin  is  composed  of  three  proteins: 
protective  antigen  (PA),  edema  factor  (EF),  and  lethal  factor 
(LF). PA and EF comprise the edema toxin (ET), and PA and
LF the lethal toxin (LT). Both of these toxins were shown to
contribute to the virulence of B. anthracis; however, it is the LT
that causes death of the infected host (Pezard et al. 1991). The
genes responsible for capsular biosynthesis (Green, 1985) and
the synthesis of LT and ET (Mikesell, 1983) are located on large
plasmids designated pXO2 and pXO1, respectively. Welkos et al.
(1988)  determined  the  nucleotide  sequence  of  the  gene  encoding 
PA. 

A  number  of  investigators  have  contributed  to  our  understand¬ 
ing  of  how  the  toxin  gains  entry  into  the  cell.  The  observation 
that  PA  blocked  the  action  of  anthrax  toxin  (Singh  et  al.  1989) 
suggested  that  they  recognized  a  common  receptor.  It  was  sub¬ 
sequently  shown  that  PA  binds  to  the  anthrax  toxin  receptor 
(ATR)  (Bradley  et  al.  2001),  is  cleaved  by  a  cell  surface  pro¬ 
tease  with  the  sequence  specificity  and  catalytic  properties  of 
furin (Klimpel et al. 1992), and then binds LF and/or EF, facilitating
internalization of these proteins into the cell (Singh et al.
1999; Friedlander, 1986). ATR is a type I membrane protein
with an extracellular von Willebrand factor A domain that binds
directly to PA (Bradley et al. 2001). The proteolytic activation
of PA is a critical step in the membrane insertion of EF and LF
(Milne et al. 1994). The activated PA forms a multi-subunit, ring-shaped
heptameric oligomer during intoxication of mammalian
cells (Milne et al. 1994). On the basis of the crystal structures of the PA
monomer and oligomer, a model of pH-dependent membrane
insertion involving the formation of a porin-like, membrane-spanning
beta-barrel was proposed (Petosa et al. 1997). The
subsequent translocation of LF and EF across the cell membrane
and into the cytosol is thought to occur by a pH- and
voltage-dependent mechanism (Zhao et al. 1995; Wesche et al.
1998; Blaustein et al. 1989).

EF was shown to have adenyl cyclase activity and to increase
cyclic AMP concentrations in eukaryotic cells (Leppla 1982).
Inhibitors  of  receptor-mediated  endocytosis  blocked  the  entry  of 
EF,  but  not  that  of  the  Bordetella  pertussis  adenyl  cyclase  toxin 
(Gordon  et  al.  1988). 

The purification of LT has facilitated studies on its biological
activity (Leppla 1988). The mechanism of action of LF inside
the cell is beginning to be understood. Macrophages play a critical
role in the pathophysiology of anthrax. Friedlander used an
in vitro system to demonstrate that LT kills macrophages
through an acid-dependent process (Friedlander
1986). Systemic shock and death of the host resulted primarily
from the effects of high levels of cytokines, principally
IL-1, produced by macrophages that had been stimulated by LT
(Hanna et al. 1993). LF contains a zinc metalloprotease consensus
sequence that is required for LT activity (Klimpel et al.
1994). LF cleaves the amino terminus of mitogen-activated protein
kinase kinases (MAPKK/MEK), including MEK1, MEK2,
MKK3, MKK4, MKK6, and MKK7, but not MKK5, thereby inhibiting
the MAPK signal transduction pathway (Duesbery et al. 1998;
Pellizzari et al. 1999; Pellizzari et al. 2000; Vitale et al. 2000).
In addition to cleavage of the N-terminus of MAPKKs, LF induced
tyrosine/threonine phosphorylation of MAPKs in cultured
macrophages (Vitale et al. 1998). However, the fact that LT-resistant
and LT-sensitive cells show similar internalization of LF
(Singh et al. 1989) and similar MEK cleavage in response to
LF (Pellizzari et al. 1999; Pellizzari et al. 2000) suggests that
these factors alone cannot account for differential susceptibility
or resistance to LT. The completion of the genome sequence of
B. anthracis (Read et al. 2002, 2003) will provide new insights
into the pathogenesis of this microorganism.

Vaccines  have  played  an  important  role  in  controlling  an¬ 
thrax.  The  veterinary  vaccine  that  is  currently  in  use  in  the  U.S. 
is  a  spore  suspension  from  an  avirulent  non-encapsulated  strain 
(Sterne  1939).  The  original  human  anthrax  vaccine  was  devel¬ 
oped  by  George  Wright  in  the  1950s  and  first  produced  on  a  large 
scale  by  Merck.  Brachman  et  al.  (1962)  examined  the  safety  of 
this vaccine and concluded that individual reactions to the vaccine
were relatively minor. The U.S. military vaccinates at-risk
personnel  for  anthrax  in  case  of  a  biological  attack.  Friedlander 
et  al.  (1993)  conducted  a  study  to  determine  whether  a  prolonged 
course  of  post-exposure  antibiotics,  with  or  without  vaccination, 
protected  monkeys  exposed  to  a  lethal  dose  of  anthrax  spores 
when  the  antibiotic  was  discontinued.  It  was  concluded  that  each 
regimen completely protected animals while on therapy and provided
significant long-term protection upon discontinuance of
the  drug.  The  use  of  the  current  anthrax  vaccine  in  U.S.  military 
personnel  has  become  controversial  due  to  reports  of  adverse  re¬ 
actions.  A  priority  area  for  current  research  is  the  development 
of  a  better  vaccine. 

B. anthracis has many biological and virulence characteristics
that have made it attractive as a bioweapon. In 1979, an accident
occurred in a military microbiology facility in Sverdlovsk,
USSR, in which a small amount (less than 1 gram) of spores was
released outside the facility, generating an aerosol that resulted
in numerous infections and at least 64 deaths (Abramova et al.
1993; Bezdenezhnykh and Nikiforov, 1980; Meselson et al.
1994). Recently, there has been considerable concern about the
use of biological agents by terrorists. B. anthracis is one of the
agents that require enhanced preparedness efforts (Franz et al.
1997; Inglesby et al. 1999). The concern about bioterrorism has
been heightened in the post-9/11 era. The mailings of letters containing
spores of B. anthracis to the media and members of the
U.S. Congress in September and October of 2001 resulted in 22
cases of anthrax (11 of these were inhalational) with 5 deaths (all
inhalational), closed part of the U.S. government’s operations,
and terrorized the American public (Jernigan et al. 2001; Hsu
et al. 2002; Morse et al. 2003). Aggressive treatment enabled
many  of  those  with  inhalational  anthrax  to  survive  (Inglesby 
et  al.  2002).  The  investigation  of  this  attack  used  a  molecular 
typing  method  (variable-number  tandem  repeat  [VNTR]  analy¬ 
sis)  (Keim  et  al.  2000)  to  identify  the  strain  of  B.  anthracis  used 
in  the  attack.  Additional  forensic  information  was  provided  by 
whole  genome  sequencing  (Read  et  al.  2002).  Nevertheless,  the 
perpetrator(s)  of  this  attack  remain  at  large. 

As  a  result  of  the  renewed  interest  in  anthrax,  there  have 
been  a  number  of  recent  review  articles,  which  provide  compre¬ 
hensive  and  complementary  perspectives  on  this  disease  (Dixon 
et  al.  1999;  Mock  and  Fouet  2001;  Gardner  2001;  Khanna  and 
Singh  2001;  Oncu  et  al.  2003).  These  review  articles  are  struc¬ 
tured  along  traditional  lines  in  that  they  cover  the  etiology  and 
pathologic  mechanisms  of  anthrax,  addressing  both  biological 
and  medical  considerations.  However,  none  of  these  reviews  pro¬ 
vide  the  infrastructure  and  technology  structure  of  the  anthrax 
literature  that  text  mining  can  provide. 